Generation of Specific Binding Partners Binding to (Poly)Peptides 
Encoded by Genomic DNA Fragments or ESTs 

The present invention relates to the generation of specific binding partners binding to 
(poly)peptides encoded by genomic DNA fragments or ESTs. The (poly)peptides are 
expressed as part of fusion proteins which form inclusion bodies on expression in host 
cells. The inclusion bodies are used to generate binding partners which bind specifically to 
said (poly)peptides. The specific binding partners, in particular immunoglobulins or 
fragments thereof, are useful for analysis and functional characterisation of proteins encoded 
by nucleic acid sequences comprising the corresponding genomic DNA fragments or ESTs. 
The invention further relates to nucleic acid molecules, vectors and host cells to be used in the 
methods of the present invention. 

The invention further relates to the use of fusion proteins comprising the first N-terminal 
domain of the geneIII protein of filamentous phage as fusion partner for the expression of a 
(poly)peptide/protein fused to said fusion partner, and to methods for the expression of 
(poly)peptide/proteins. 

For several years, massive efforts have been undertaken to sequence the human genome, and 
to identify and characterise the structure and function of the proteins encoded therein. Ultimately, 
this will lead to novel targets for prevention, diagnosis and therapy of diseases (Collins & 
Galas, 1993; Adams et al., 1995). 

Currently, two different approaches are being pursued for identifying and characterising the 
genes distributed along the human genome. In one approach, large fragments of genomic 
DNA are isolated, cloned, and sequenced. Potential open reading frames in these genomic 
sequences are identified using bioinformatics software. However, this approach entails 
sequencing large stretches of human DNA which do not encode proteins in order to find the 
protein encoding sequences scattered throughout the genome. In addition to requiring 
extensive sequencing, this approach relies on bioinformatics software which may 
mischaracterise the genomic sequences obtained. Thus, the software may produce false 
positives in which non-coding DNA is mischaracterised as coding DNA, or false negatives 
in which coding DNA is mislabelled as non-coding DNA. 

In an alternative approach, complementary DNAs (cDNAs) are synthesised from isolated 
messenger RNAs (mRNAs) which encode human proteins. Using this approach, sequencing 
is only performed on DNA which is derived from protein coding sequences of the genome. 
Often, only short stretches of the cDNAs are sequenced to obtain sequences called expressed 
sequence tags (ESTs) (WO93/00353). 

In principle, the ESTs may then be used to isolate or purify extended cDNAs which include 
sequences adjacent to the EST sequences. These extended cDNAs may contain portions or the 
full coding sequence of the gene from which the EST was derived. 

By analysing the genomic DNA or fragments thereof, ESTs, extended cDNAs, and/or the 
(poly)peptides/proteins encoded thereby, it may be possible, in certain cases where homology, 
structural motifs etc. can be identified, to assign a function to the (poly)peptide/protein 
which can be tested or verified in vitro or in vivo. However, the various EST-sequencing 
efforts have led to enormous numbers of ESTs, and to the problem of how best to structure that 
information and how to identify interesting sequences. Hence, there is still a need for 
developing and using research tools directed against the (poly)peptides/proteins of interest to 
analyse their localisation in cell and tissue types, their up- or down-regulation in certain 
disease or development stages, or their role in activating or blocking certain interactions or 
signalling routes. 

One approach is to use antibodies or fragments thereof as such research tools. In 
WO93/00353 it was suggested to express the ESTs and to generate antibodies by immunising 
animals with the corresponding (poly)peptides. In a similar approach, DNA constructs 
comprising EST sequences have been injected into animals to generate an immune response 
against the (poly)peptide expressed in vivo (Sykes & Johnston, 1999). However, these 
approaches are not amenable to a high-throughput generation of antibodies. 

Alternatively, antibodies are generated against sets of overlapping peptides covering the EST 
sequence (Persic et al., 1999). In combination with screening recombinant antibody libraries, 
this approach can in principle be developed to generate antibody fragments as research tools 
with high throughput. However, it is often difficult to obtain anti-peptide antibodies with 
sufficiently high affinities. 

Thus the technical problem underlying the present invention is to provide a generally 
applicable method for the generation of specific binding partners binding to (poly)peptides 
encoded by genomic DNA fragments or by ESTs, especially of antibodies or antibody 
fragments, for analysis and functional characterisation of proteins corresponding to genomic 
DNA or ESTs. The solution to the above technical problem is achieved by providing the 
embodiments characterised in the claims. The technical approach of the present invention, namely to 
provide (poly)peptides encoded by genomic DNA fragments or ESTs for the generation of 
specific binding partners, such as antibodies or antibody-derived products, by expressing the 
(poly)peptides as fusions with (poly)peptide/protein fusion partners which lead to the 
formation of inclusion bodies on expression in host cells, such as E. coli, and to generate 
specific binding partners against the inclusion bodies and fusion proteins obtainable 
therefrom, is neither provided nor suggested by the prior art. 

A further problem related to the present invention was to devise a method for the expression 
of (poly)peptide/proteins which are not easily expressed in free form, e.g. since they are toxic 
to the host cell. The solution to that technical problem is also achieved by providing the 
embodiments characterised in the claims. The technical approach of the present invention, 
namely to express the (poly)peptides/proteins as fusion proteins comprising the first N-terminal 
domain of the geneIII protein of filamentous phage, leading to the formation of inclusion 
bodies, is neither provided nor suggested by the prior art. 

Thus, the present invention relates to a method for generating a specific binding partner to a 
(poly)peptide which is encoded by a nucleic acid sequence comprised in a genomic 
DNA fragment or an expressed sequence tag (EST) comprising: 

a) expressing a nucleic acid molecule encoding a fusion protein in a host cell under 
conditions that allow the formation of inclusion bodies comprising said fusion 
protein, wherein said fusion protein comprises 

aa) a (poly)peptide/protein fusion partner which is deposited in inclusion 
bodies when expressed in said host cell under said conditions and 

ab) said (poly)peptide; 

b) isolating said inclusion bodies; and 

c) generating a specific binding partner that binds specifically to said (poly)peptide. 

In the context of the present invention, a "specific binding partner" is a molecule which is able 
to specifically bind to a (poly)peptide of interest. Such a specific binding partner may be a 
peptide, a constrained peptide, an immunoglobulin or fragment thereof, or a cognate binding 
partner of a naturally occurring protein, e.g. a ligand to a receptor which comprises the 
(poly)peptide of interest. Such cognate ligand may be obtainable by screening a cDNA 
expression library for binding to the fusion protein of the present invention. The specific 
binding partner may also be a non-proteinaceous specific binding partner such as a small 
molecule, e.g. obtainable by screening of a combinatorial library of small molecules. A 
specific binding partner may further be modified to enable the detection of an interaction of a 
specific binding partner and the corresponding (poly)peptide. Such modification may be a 
detection and/or purification tag (Hochuli et al., 1988; Lindner et al., 1992; Hopp et al., 1988; 
Prickett et al., 1989; Knappik & Pluckthun, 1994), or an enzyme (Blake et al., 1984) or a 
reporter molecule fused or coupled to the specific binding partner. 

In the context of the present invention, the term "(poly)peptide" relates to molecules 
consisting of one or more chains of multiple, i.e. two or more, amino acids linked via peptide 
bonds. 

The term "protein" refers to (poly)peptides where at least part of the (poly)peptide has or is 
able to acquire a defined three-dimensional arrangement by forming secondary, tertiary, or 
quaternary structures within and/or between its (poly)peptide chain(s). This definition 
comprises proteins such as naturally occurring or at least partially artificial proteins, as well 
as fragments or domains of whole proteins, as long as these fragments or domains have a 
defined three-dimensional arrangement as described above. 

The term "genomic DNA fragment" refers to a contiguous nucleic acid sequence forming part 
of the genome of an organism and being obtained or obtainable therefrom. 

The term "expressed sequence tags (ESTs)" refers to contiguous DNA sequences obtained by 
sequencing stretches of cDNAs. 

According to the present invention, such a genomic DNA fragment or EST comprises a 
nucleic acid sequence which encodes a (poly)peptide or consists of a putative open reading 
frame (ORF). 

The EST databases (Eckmann et al., 1998; Bouck et al., 1999) often contain sequences of low 
sequence quality (Aaronson et al., 1996). One of ordinary skill in the art will be able to 
identify at least one putative ORF in a given genomic DNA fragment or EST sequence, and 
it will not constitute an undue burden for the person skilled in the art to clone all ORFs 
identified in that way for the expression of a corresponding set of said fusion proteins, and to 
use them according to the present invention. 
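
The identification of putative ORFs mentioned above can be illustrated by a minimal sketch. It is not part of the described method; it assumes that a "putative ORF" is simply a stop-codon-free stretch in any of the six reading frames (an EST is a fragment and need not contain a start codon), and the function names and the min_codons threshold are hypothetical choices made for illustration.

```python
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def putative_orfs(seq: str, min_codons: int = 30) -> List[Tuple[str, int, str]]:
    """Scan all six reading frames for stop-free stretches.

    Returns (strand, frame, dna) tuples for every stretch of at least
    min_codons codons. Assumption: no start codon is required, since an
    EST may cover only an internal part of a coding sequence.
    """
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            current = []  # codons of the stretch currently being read
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if codon in STOP_CODONS:
                    if len(current) >= min_codons:
                        found.append((strand, frame, "".join(current)))
                    current = []
                else:
                    current.append(codon)
            # A stretch may run to the end of the fragment without a stop.
            if len(current) >= min_codons:
                found.append((strand, frame, "".join(current)))
    return found
```

Each stretch returned in this way would correspond to one candidate insert to be cloned behind the fusion partner; low-quality sequence (ambiguous bases, frameshifts) is not handled by this sketch.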

The length of the genomic DNA fragment or EST is preferably between 100 and 2000 base 
pairs, more preferably between 200 and 1500 base pairs. 

The nucleic acid molecule encoding a fusion protein used according to the present invention, 
or an appropriate vector comprising said nucleic acid molecule, further comprises non-coding 
DNA sequences which are required to cause or allow the expression of the fusion protein. 
Methods for construction of nucleic acid molecules encoding a fusion protein used according 
to the present invention, for construction of vectors comprising said nucleic acid molecules, 
for introduction of said vectors into appropriately chosen host cells, for causing or achieving 
the expression of said fusion proteins are well-known in the art (see, e.g., Sambrook et al., 
1989; Ausubel et al., 1994). 

The formation of inclusion bodies can be observed in several host systems in the course of the 
expression of a (poly)peptide/protein. Inclusion bodies are insoluble aggregates of 
(poly)peptide/protein deposited within a host cell. They are very dense particles which exhibit 
an amorphous or paracrystalline structure independent of their subcellular location. Under 
appropriate conditions the recombinant (poly)peptide/protein deposited in inclusion bodies 
amounts to about 50% or more of the total cell protein. The formation of inclusion bodies, and 
their properties, and applications thereof have been investigated in detail (see, for example, 
Rudolph, 1996; Rudolph & Lilie, 1996; Rudolph et al., 1997; Lilie et al., 1998). Methods of 
purifying inclusion bodies have been described therein as well and are well-known to one of 
ordinary skill in the art. 

The use of inclusion bodies formed by expression of fusion proteins comprising a 
fusion partner and a (poly)peptide/protein as a general means of expressing said 
(poly)peptide/protein has been described (WO 98/30684). 

A fusion partner suitable for a method according to the present invention may be any 
(poly)peptide/protein which can be found in inclusion bodies when expressed in a host cell. 
In most cases, inclusion body formation is a consequence of high expression rates, regardless 
of the system or protein used. There seems to be no correlation between the propensity of 
inclusion body formation of a certain protein and its intrinsic properties, such as molecular 
weight, hydrophobicity, folding pathways, and so on. (Poly)peptides/proteins where inclusion 
body formation has been observed and which, therefore, are suitable candidates to be used as 
fusion partners according to the present invention, include, but are not limited to, E. coli 
proteins such as maltose-binding protein (Betton & Hofnung, 1996), RNAse II (Coburn & 
Mackie, 1996), alkaline phosphatase (Derman & Beckwith, 1995), phospholipase A (Dekker et 
al., 1995), β-lactamase (Rinas & Bailey, 1993), thioredoxin (Hoog et al., 1984; WO 
98/30684), and non-E. coli proteins such as human procathepsin B (Kuhelj et al., 1995), 
porcine interferon-γ (Vandenbroeck et al., 1993), or T5 DNA polymerase (Chatterjee et al., 
1991). 

The host referred to above may be any of a number commonly used in the production of 
proteins, including but not limited to bacteria, such as E. coli (see, e.g., Ge et al., 1995) or 
Bacillus subtilis (Wu et al., 1993); fungi, such as yeasts (Horwitz et al., 1988; Ridder et al., 
1995) or filamentous fungi (Nyyssonen et al., 1993); plant cells (Hiatt, 1990; Hiatt & Ma, 
1993; Whitelam et al., 1994); insect cells (Potter et al., 1993; Ward et al., 1995); or 
mammalian cells (Trill et al., 1995). 

The generation, and optionally, identification, of "a binding partner that binds specifically to 
said (poly)peptide" can be achieved by using a variety of methods, depending on the type of 
specific binding partner, which are well-known to one of ordinary skill in the art. For 
example, combinatorial libraries of chemical compounds, peptides or biomolecules, such as 
immunoglobulins, can be screened and/or selected against the isolated inclusion body as 
target, preferably after purification, or, more preferably, against the fusion protein obtained 
from said inclusion bodies, either in solubilised or in refolded form, or against the free 
(poly)peptide as target (see, for example: http://www.5z.com/divinfo/reviews.html; Pinilla et 
al., 1999; Woodbury & Venton, 1999; Borman, 1999; Eisele et al., 1999; Lebl, 1999). 

In a preferred embodiment of the method of the invention, said fusion protein comprises said 
fusion partner as N-terminal portion and said (poly)peptide as C-terminal portion. 

Further preferred is a method wherein said fusion protein further comprises a (poly)peptide 
linker linking said fusion partner and said (poly)peptide. 

The linker may consist of about 1 to about 30, preferably of between about 5 and about 15 
amino acids. 

Particularly preferred is a method wherein said linker comprises a cleavage signal. 

In the context of the present invention, the term "cleavage signal" refers to an amino acid 
sequence which allows the fusion protein to be cleaved, e.g. by chemical or enzymatic reactions, 
between said fusion partner and said (poly)peptide in order to obtain said (poly)peptide in 
free form. Such cleavage signal is preferably a specific recognition sequence of a protease 
well known to one of ordinary skill in the art, such as enterokinase or thrombin. Alternatively, 
the fusion protein might be cleaved by chemical cleavage with a chemical such as cyanogen 
bromide. 
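
The arrangement of fusion partner, linker with cleavage signal, and (poly)peptide can be pictured with the following minimal sketch. It is an illustration only: the class and function names are hypothetical, the example sequences used with it are placeholders rather than sequences from this specification, and the two recognition motifs are the commonly cited ones for enterokinase (cleavage after DDDDK) and thrombin (cleavage between R and G in LVPRGS).

```python
from dataclasses import dataclass
from typing import Tuple

# Recognition motif and the offset within the motif after which the
# protease cuts (commonly cited sites; illustrative, not exhaustive).
CLEAVAGE_SIGNALS = {
    "enterokinase": ("DDDDK", 5),   # cuts after the lysine
    "thrombin": ("LVPRGS", 4),      # cuts between arginine and glycine
}


@dataclass
class FusionProtein:
    partner: str      # N-terminal fusion partner (inclusion-body forming)
    linker: str       # (poly)peptide linker, may contain a cleavage signal
    polypeptide: str  # C-terminal (poly)peptide of interest

    @property
    def sequence(self) -> str:
        return self.partner + self.linker + self.polypeptide


def cleave(fusion: FusionProtein, protease: str) -> Tuple[str, str]:
    """Return (N-terminal fragment, C-terminal fragment) after cleavage.

    Assumption: only the linker is searched for the motif; a real protein
    may carry additional sites in the partner or the (poly)peptide.
    """
    motif, offset = CLEAVAGE_SIGNALS[protease]
    pos = fusion.linker.find(motif)
    if pos < 0:
        raise ValueError(f"linker contains no {protease} cleavage signal")
    cut = len(fusion.partner) + pos + offset
    seq = fusion.sequence
    return seq[:cut], seq[cut:]
```

With a placeholder construct such as FusionProtein("MKTAYIAK", "GGSGGDDDDK", "QERTLLV"), cleaving with "enterokinase" would release the C-terminal (poly)peptide without additional residues, because the motif ends the linker. A thrombin site leaves the residues following the cut (here G and S) attached to the released (poly)peptide.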

Said fusion protein may further comprise additional (poly)peptide sequences at N- and/or 
C-terminus, and/or in said (poly)peptide linker. This comprises, for example, (poly)peptides 
which allow said fusion protein to be identified and/or purified. Examples for such (poly)peptide 
tags are His tags (Hochuli et al., 1988; Lindner et al., 1992), myc, FLAG (Hopp et al., 1988; 
Prickett et al., 1989; Knappik & Pluckthun, 1994), or a Strep-tag (Schmidt & Skerra, 1993; 
Schmidt & Skerra, 1994; Schmidt et al., 1996). These tags are all well known in the art and 
are fully available to the person skilled in the art. 

In a yet further preferred embodiment of the method of the invention, said genomic DNA 
fragment or said EST is obtained from a prokaryotic organism or from a virus. 
Most preferred is a method wherein said prokaryotic organism or virus is a pathogen. 

By sequencing the genomes of organisms pathogenic to humans, animals or 
plants, new proteinaceous targets for prevention, diagnosis and/or therapeutic intervention are 
being sought. 

Further preferred is a method wherein said nucleic acid is expressed under conditions 
allowing over-expression of said fusion protein. 

In a further preferred embodiment, the invention relates to a method wherein said genomic 
DNA fragment or said EST is obtained from a eukaryotic organism. 

In a preferred embodiment, the present invention relates to a method wherein said genomic 
DNA fragment or said EST is obtained from a non-mammalian species. 

Further preferred is a method wherein said genomic DNA fragment or said EST is obtained 
from a mammalian species. 

In a most preferred embodiment the present invention relates to a method wherein said 
mammalian species is human. 

In a preferred embodiment of the method of the invention, said host cell is a eukaryotic cell. 
Particularly preferred is a yeast or insect cell. 

In a most preferred embodiment of the method of the invention, said host cell is a prokaryotic 
cell. Particularly preferred is a bacterial cell. Most preferably, said bacterial cell is an E. coli 
cell. 

An additional preferred embodiment of the invention relates to a method wherein said fusion 
protein is expressed in the cytosol of a bacterial host cell. 

Particularly preferred is the cytosolic expression of fusion proteins according to the present 
invention wherein said fusion partner contains at least one disulfide bond. 

It has been found that inclusion body formation can be anticipated if a disulfide-bonded 
(poly)peptide/protein is produced in the bacterial cytosol, as formation of disulfide bonds 
does not usually occur in this reducing cellular compartment. The consequence is improper 
folding resulting in aggregation (Lilie et al., 1998). 

Further preferred is a method where said fusion partner is a secreted protein, and wherein said 
nucleic acid does not comprise a nucleic acid sequence encoding a signal sequence for the 
transport of the fusion protein to the periplasm. 

It has been observed that cytosolic expression of secreted (poly)peptide/protein leads to the 
formation of inclusion bodies (Lilie et al., 1998). 

In a preferred embodiment the present invention relates to a method wherein said fusion 
partner is an endogenous (poly)peptide/protein of said host cell. 

Most preferred is a method wherein said fusion partner is a (poly)peptide/protein foreign to 
said host cell. 

Particularly preferred is a method wherein said fusion partner is taken from the list of E. coli 
maltose-binding protein, E. coli RNAse II, E. coli alkaline phosphatase, E. coli phospholipase 
A, E. coli β-lactamase, E. coli thioredoxin, human procathepsin B, porcine interferon-γ, and T5 
DNA polymerase. 

In a further most preferred embodiment of the method of the invention, said host cell is E. coli 
and said fusion partner comprises the first N-terminal domain of the geneIII protein of a 
filamentous phage. 

Preferably, said fusion partner consists of the two N-terminal domains of the geneIII protein, 
more preferably of the first N-terminal domain of the geneIII protein. 

Most preferably, said fusion partner consists of amino acids 1 to 82 of the geneIII protein. 

Infection of Escherichia coli by the Ff filamentous phages f1, fd, and M13 is initiated by 
interaction of the geneIII protein (g3p) located at one end of the phage particle with the tip of 
the F conjugative pilus (Model & Russel, 1988). Mature g3p (406 amino acids) consists of 3 
domains separated by linker sequences (Stengele et al., 1990; Krebber et al., 1997). The 
following roles could be assigned to the individual domains: the N-terminal domain of g3p 
(N1) is responsible for membrane penetration (Riechmann & Holliger, 1997), the middle 
domain (N2) for binding of the bacterial F-pilus (Stengele et al., 1990), and the C-terminal 
domain (CT) plays a role in phage morphogenesis and caps one end of the phage particle 
(Crissman & Smith, 1984). The crystal structure of the two N-terminal domains of g3p 
(N1-N2) and the solution structure of N1 have been solved (Lubkowski et al., 1998; Holliger & 
Riechmann, 1997). Purified N1 was shown to be highly soluble and monomeric at mM 
concentrations (Holliger & Riechmann, 1997). Expression of N1 or N1-N2 in the cytoplasm 
of E. coli, however, leads to the formation of inclusion bodies from which the proteins can be 
refolded (C. Krebber, 1996; Krebber et al., 1997). Since expression of N1 and N1-N2 fusion 
proteins is toxic to the cells (C. Krebber, 1996), tight regulation of transcription of the fusion 
genes is preferred, using for example the pET (Stratagene, La Jolla, CA, USA) or the pBAD 
expression system (Invitrogen BV, Groningen, The Netherlands). The use of these vectors is 
applicable in all cases where toxic effects of gene products are expected, assumed or 
observed, and is one of the first steps well known to one of ordinary skill in the art in 
adjusting expression conditions. 

Fusion partners comprising the first N-terminal domain of gIIIp are particularly useful since 
the fusion proteins readily form inclusion bodies on cytosolic expression, but are easily 
solubilised (Krebber et al., 1997). 

The fusion partner may also be a variant or a mutant of a parental fusion partner referred to 
hereinabove (such as a (poly)peptide/protein comprising the first N-terminal domain of gIIIp), 
provided that such variant or mutant is deposited in inclusion bodies as well when expressed 
in a host cell under conditions where the parental fusion partner is deposited in inclusion 
bodies. Such variant or mutant may result from the parental fusion partner e.g. by adding, 
substituting and/or deleting one or more amino acid residue(s). Since the formation of 
inclusion bodies on expression is a property which can easily be monitored by one of ordinary 
skill in the art, it does not require an undue burden of experimentation to identify variants or 
mutants with properties suitable for the methods of the present invention. 

In a further preferred embodiment, the invention relates to a method wherein step b) further 
comprises the step of (i) solubilising said fusion protein under suitable conditions. 

In a yet further preferred embodiment, the present invention relates to a method wherein step 
b) further comprises the step of (ii) refolding said fusion protein under suitable conditions. 

Methods for solubilising and/or refolding (poly)peptides/proteins found deposited in inclusion 
bodies have been thoroughly investigated and are well known to the practitioner of ordinary 
skill in the art (see, for example, Rudolph, 1996; Rudolph & Lilie, 1996; Rudolph et al., 1997; 
Lilie et al., 1998). 

In another preferred embodiment, the invention relates to a method wherein said fusion 
protein further comprises a (poly)peptide linker linking said fusion partner and said 
(poly)peptide, wherein said linker comprises a cleavage signal, and wherein step b) further 
comprises the steps of (iii) cleaving said fusion protein between said fusion partner and said 
(poly)peptide, and (iv) isolating said (poly)peptide in free form. 

Further preferred is a method further comprising the step of purifying said fusion protein or 
said (poly)peptide in free form. 

The construction of fusion proteins comprising a cleavage signal which allows the 
fusion protein to be cleaved between said fusion partner and said (poly)peptide has been 
described hereinabove. 

In a preferred embodiment of the method of the invention, said specific binding partner is an 
immunoglobulin or a fragment thereof. 

In this context, "immunoglobulin" is used as a synonym for "antibody". Immunoglobulin 
fragments according to the present invention may be Fv (Skerra & Pluckthun, 1988), scFv 
(Bird et al., 1988; Huston et al., 1988), disulfide-linked Fv (Glockshuber et al., 1992; 
Brinkmann et al., 1993), Fab, F(ab')2 fragments or other fragments well-known to the 
practitioner skilled in the art, which comprise the variable domain of an immunoglobulin or 
immunoglobulin fragment. 

Particularly preferred is the scFv fragment format. 

In a most preferred embodiment of the method of the invention, said immunoglobulin or 
fragment thereof is generated by (i) immunisation of an animal with said inclusion bodies, 
said fusion protein or said (poly)peptide, and (ii) by selecting an immunoglobulin produced 
by said animal which specifically binds to said inclusion bodies, said fusion protein or said 
(poly)peptide. 

Methods for immunising animals and for screening and/or selection of specific 
immunoglobulin are well-known to one of ordinary skill in the art. 

In a further most preferred embodiment of the method of the invention, said immunoglobulin 
or fragment thereof is generated by selecting a member of a recombinant library of 
immunoglobulins or fragments thereof which specifically binds to said inclusion bodies, said 
fusion protein or said (poly)peptide. 

Recombinant libraries of immunoglobulins or fragments thereof have been described in 
various publications (see, e.g., Vaughan et al., 1996; Knappik et al., 2000; WO 97/08320), 
and are well-known to one of ordinary skill in the art. 

Particularly preferred is a method wherein said library is displayed on the surface of a 
replicable genetic package. 

The term "replicable genetic package" refers to an entity which combines phenotype and 
genotype of members of a library of (poly)peptides/proteins by linking the genetic 
information encoding the library member and the (poly)peptide/protein expressed therefrom. 
The library can be screened and/or selected for a desired property, and the 
(poly)peptide/protein being screened and/or selected can be identified via the genetic 
information associated with the same. Examples for "replicable genetic packages" comprise 
cells, such as bacteria (WO 90/02809; Georgiou et al., 1993; Francisco & Georgiou, 1994; 
Daugherty et al., 1998), yeast (Boder & Wittrup, 1997; Kieke et al., 1997; Cho et al., 1998; 
Kieke et al., 1999), or insect cells (Ernst et al., 1998); viruses, such as bacteriophage (WO 
90/02809; Kay et al., 1996; Dunn, 1996; McGregor, 1996) or retroviruses (Russell et al., 1993); 
spores (WO 90/02809); or complexes of nucleic acid molecules and (poly)peptides/proteins 
expressed therefrom, such as in ribosome complexes (Hanes & Pluckthun, 1997; Hanes et al., 
1998; Hanes et al., 1999) or in complexes connected either non-covalently (Cull et al., 1992; 
Schatz, 1993; Schatz et al., 1996; Gates et al., 1996) or covalently (Nemoto et al., 1997). 



Further preferred is a method wherein said replicable genetic package is a filamentous phage. 

In the context of the present invention, the term "filamentous phage" refers to a class of 
bacteriophage which are able to infect a variety of Gram-negative bacteria. They have a 
single-stranded, covalently closed DNA genome which is packaged in a protein coat forming 
a long cylinder. The best characterised of these phage are M13, fd, and f1 and derivatives 
thereof. Filamentous phage have been used extensively for the display of foreign 
(poly)peptides/proteins and libraries thereof, and the various approaches and applications 
have been reviewed in several publications (e.g. Kay et al., 1996; Dunn, 1996; McGregor, 
1996). 

Particularly preferred is the use of a fusion protein comprising the N-terminal domain of the 
geneIII protein (g3p) of filamentous phage as fusion partner for biopanning of a recombinant 
library of immunoglobulins or fragments thereof displayed on the surface of filamentous 
phage. 

The following properties of N1 make it an especially suitable candidate to be used in 
biopanning of phage display libraries: 

- N1 (amino acids 1 - 82 of the mature g3p) is small and has a low pI of 4.14, which is 
advantageous for coating to conventional microtiter plates used for biopanning, which is 
routinely done at physiological pH 

- most phages displaying N1-binding scFvs on their surface should automatically be 
removed since they should bind to other phages which carry 3-5 copies of g3p comprising 
N1 on their surface. 
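
The relationship between a low pI and the net charge at physiological pH can be illustrated with a rough estimate. The sketch below is not the method by which the value of 4.14 was obtained; it uses approximate, illustrative pKa values (published scales differ, so results vary between tools), ignores structural effects, and the function names are hypothetical. The N1 sequence itself is not reproduced here.

```python
# Approximate pKa values (illustrative assumption; published scales differ).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(seq: str, ph: float) -> float:
    """Estimate the net charge of a peptide at a given pH
    (Henderson-Hasselbalch, independent ionisable groups)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # free N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # free C-terminus
    for aa in seq.upper():
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(seq: str, tol: float = 1e-4) -> float:
    """Find the pH at which the estimated net charge is zero by bisection.
    The net charge decreases monotonically with pH, so bisection converges."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid   # still positively charged: pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0
```

A (poly)peptide whose estimated pI lies well below 7 carries a net negative charge at physiological pH, which is the situation described for N1 above.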

In another embodiment, the present invention relates to a nucleic acid molecule encoding a 
fusion protein comprising aa) the first N-terminal domain of the geneIII protein of 
filamentous phage and ab) a (poly)peptide which is encoded by a nucleic acid sequence 
comprised in a genomic DNA fragment or an expressed sequence tag (EST), wherein said 
nucleic acid molecule does not comprise a nucleic acid sequence encoding a signal sequence 
for the transport of the fusion protein to the periplasm of a bacterial host cell. 

In a further embodiment, the invention relates to a vector which comprises a nucleic acid 
molecule of the present invention. 



Preferably, said vector is an expression vector. 

In another embodiment, the invention relates to a host cell comprising a nucleic acid or a 
vector according to the present invention. 

Particularly preferred is a host cell which is an E. coli cell. 

Additionally, the invention relates to the use of a fusion protein comprising the first 
N-terminal domain of the geneIII protein of filamentous phage as fusion partner for the 
expression of a (poly)peptide/protein fused to said fusion partner, wherein said fusion protein 
is obtained in the form of inclusion bodies. 

The general method of using inclusion bodies formed by expression of fusion 
proteins comprising a fusion partner and a (poly)peptide/protein as a means of expressing said 
(poly)peptide/protein has been described (WO 98/30684). 

The fusion protein may further comprise a linker sequence linking said fusion partner and said 
(poly)peptide/protein. The linker may consist of about 1 to about 30, preferably of between 
about 5 and about 15 amino acids. The linker may comprise a cleavage signal which allows the 
fusion protein to be cleaved between the fusion partner and the (poly)peptide/protein in order 
to obtain said (poly)peptide/protein in free form. Such cleavage signal is preferably a specific 
recognition sequence of a protease well known to one of ordinary skill in the art, such as 
enterokinase or thrombin. Alternatively, the fusion protein might be cleaved by chemical 
cleavage with a chemical such as cyanogen bromide. 

Such fusion proteins, after refolding, can be used in in vitro SIP as well (Krebber et al., 1997). 

The invention furthermore relates to a method for the expression of a (poly)peptide/protein 
comprising: 

a) expressing a nucleic acid molecule encoding a fusion protein in a host cell under conditions 
that allow the formation of inclusion bodies comprising said fusion protein, wherein said 
fusion protein comprises 

aa) the first N-terminal domain of the geneIII protein of filamentous phage, and 

ab) said (poly)peptide/protein. 

Particularly preferred is a method further comprising the steps of 

b) isolating said inclusion bodies; and 

c) solubilising said fusion protein under suitable conditions. 

The specific binding partners generated according to the present invention may be used for 
the identification and/or characterisation of a naturally occurring (poly)peptide/protein 
comprising said (poly)peptide. 

Such uses include, but are not limited to, the use of specific binding partners such as 
immunoglobulins or fragments thereof in immunoassays such as ELISA, in Western blot 
analysis of cell extracts, immunohistochemistry or immunocytochemistry on tissues or cells, 
immunoprecipitations, immunocoprecipitation using cell extracts, and so on. The use of 
specific binding partners such as immunoglobulins or fragments thereof in such binding 
assays, or in similar methods, and in the isolation of target material is well-known to one of 
ordinary skill in the art. 

By using the specific binding partners generated according to the present invention it will be 
possible to identify and/or characterise naturally occurring (poly)peptides/proteins comprising 
said (poly)peptide. 

Methods for isolating naturally occurring (poly)peptides/proteins from natural sources, and 
methods for the identification of these (poly)peptides/proteins, either directly or via the 
genetic information encoding them, are well-known to one of ordinary skill in the art. 

Figure legends 
Figure 1: 

(A) Vector map of expression vector pTFT74-N1-MCS-H. 

(B) Sequence of expression vector pTFT74-N1-MCS-H. 

Figure 2: 

(A) Vector map of expression vector pTFT74-H-N1-MCS. 

(B) Sequence of expression vector pTFT74-H-N1-MCS. 

Figure 3: Expression of fusion protein constructs 

After expression, whole cell lysates were run on a 12% SDS PAA Ready gel (Bio-Rad) under 
reducing conditions. The gel was stained using Coomassie Blue. 

Lane 1, high molecular weight Rainbow marker (Amersham); molecular masses of proteins 
are indicated; 

lane 2, N1 fused to a fragment of an MHC class II beta chain (calculated mass of fusion 
protein: 33.4 kDa); 

lane 3, N1 fused to a fragment of an MHC class II alpha chain (calculated mass of fusion 
protein: 32.2 kDa); 

lane 4, N1 fused to the C-terminal 280 amino acids of human NF-κB p100, amplified by 
PCR from IMAGE clone 434322 for cloning into pTFT74-N1-MCS-H (calculated mass of 
fusion protein: 39.9 kDa); 

lane 5, N1 fused to mature human ICAM-1 (calculated mass of fusion protein: 65.7 kDa); 

lane 6, N1 fused to a fragment of human ICAM-1 (amino acids 401 - 480 of the unprocessed 
protein, calculated mass of fusion protein: 19.3 kDa); 

lane 7, N1 fused to a fragment of human ICAM-1 (amino acids 151 - 532 of the unprocessed 
protein, calculated mass of fusion protein: 52.2 kDa); 

lane 8, N1 fused to a fragment of UL84 of human cytomegalovirus (amino acids 68 - 586, 
calculated mass of fusion protein: 68.4 kDa); 

lane 9, N1 fused to a fragment of UL84 of human cytomegalovirus (amino acids 200 - 586, 
calculated mass of fusion protein: 53.2 kDa); and 

lane 10, N1 fused to a fragment of UL84 of human cytomegalovirus (amino acids 300 - 586, 
calculated mass of fusion protein: 42.2 kDa). 

Figure 4: Specificity ELISA of 3 different scFvs (clones 1-3) selected against N1-MacI. 
Preparation of the periplasmic fraction of JM83 cells containing scFv clones 1-3 on an 
expression vector was as described (Knappik et al., 1993). 1 µg of N1-MacI, MacI, N1-hag, 
N1 and BSA, respectively, in PBS was coated for 12 h at 4°C onto a Nunc Maxisorb microtiter 
plate (# 442404), which was then blocked for 2 h at room temperature using PBS containing 
5% skim milk powder. Periplasmic fractions were mixed 1:1 with PBS containing 5% skim 
milk powder and 0.05% Tween 20 and incubated for 1 h at room temperature before they were 
added to the blocked wells of the microtiter plate. Incubation was for 1 h at room temperature. 
Since all HuCAL scFvs carry an N-terminal M1 FLAG (Knappik & Plückthun, 1994), an M1 
anti-FLAG antibody (Sigma # F-3040) was applied to the wells and incubated for 1 h at room 
temperature (2nd antibody). Bound M1 anti-FLAG antibodies were detected with an 
anti-mouse IgG-HRP conjugate (Sigma # A-6782; 3rd antibody) and BM blue soluble 
(Boehringer Mannheim # 1484281) as substrate. After blocking and after incubation with the 
periplasmic fractions, the M1 anti-FLAG antibody and the anti-mouse IgG-HRP conjugate, 
the ELISA plate was washed 5 times using TBS buffer containing 0.05% Tween 20 and 
1 mM CaCl2. Absorbance at 370 nm was measured after addition of substrate. 

Figure 5: 

(A) Vector map of expression vector pBAD-N1-MCS-H. 

(B) Sequence of expression vector pBAD-N1-MCS-H. 

Figure 6: Expression of fusion protein constructs and one-step affinity purification. 
Samples were run on a 12% SDS polyacrylamide gel (Bio-Rad) under reducing conditions. 
The gel was stained using Coomassie Blue. 

Lane 1, marker proteins with relative molecular masses indicated (to be multiplied by 10^3); 

lane 2, crude lysate of E. coli BL21(DE3)pLysS harbouring vector pTFT74-N1-MacI after 3 h 
induction with 1 mM IPTG; 

lane 3, refolded inclusion bodies from N1-MacI expression; 

lane 4, affinity-purified, refolded N1-MacI; 

lane 5, crude lysate of E. coli BL21(DE3)(pLysS) harbouring vector pTFT74-N1-U2 after 3 h 
induction with 1 mM IPTG; 

lane 6, affinity-purified, refolded N1-U2; 

lane 7, crude lysate of E. coli BL21(DE3)(pLysS) harbouring vector pTFT74-N1-I3 after 3 h 
induction with 1 mM IPTG; 

lane 8, affinity-purified, refolded N1-I3; 

lane 9, crude lysate of E. coli BL21(DE3)(pLysS) harbouring vector pTFT74-N1-B1 after 3 h 
induction with 1 mM IPTG; 

lane 10, affinity-purified, refolded N1-B1. 

Figure 7: Purity of affinity-purified, refolded N1 fusion proteins. 

Samples were run on a 12% SDS polyacrylamide gel (Bio-Rad) under reducing conditions. 
The gel was stained using Coomassie Blue. The calculated molecular weight of the fusion 
protein is given in brackets. Lane 1, marker proteins with relative molecular masses indicated 
(to be multiplied by 10^3); lane 2, N1-U1fl (75.6 kDa); lane 3, N1-U2 (68.4 kDa); lane 4, 
N1-U4 (42.2 kDa); lane 5, N1-I1fl (65.7 kDa); lane 6, N1-I3 (19.3 kDa); lane 7, N1-I4 (52.2 
kDa); lane 8, N1-B1 (33.4 kDa); lane 9, N1-A14 (32.2 kDa); lane 10, N1-Np50 (51.3 kDa). 
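
The calculated masses quoted in the legends to Figures 3 and 7 are values derived from the amino acid sequence of each fusion protein. The following is a minimal sketch of such a calculation, assuming average residue masses (standard textbook values, not taken from this document) and an unmodified chain. The function name `average_mass_da` is hypothetical, and the example uses only the 18-residue C-terminal peptide of SEQ ID NO: 1, not a complete fusion protein.

```python
# Average residue masses in daltons (assumed standard values, not from this document).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per chain (free N- and C-terminus)


def average_mass_da(sequence: str) -> float:
    """Return the average molecular mass in Da of an unmodified peptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


hag_tail = "PYDVPDYASLRSHHHHHH"  # SEQ ID NO: 1
print(f"{len(hag_tail)} residues, {average_mass_da(hag_tail) / 1000:.1f} kDa")
```

Applying the same function to the full amino acid sequence of a fusion protein would give a value comparable to those listed in the legends, provided the same mass convention was used there, which the document does not state.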

The following examples illustrate the invention. 

Examples: 

In the following description, all molecular biology experiments are performed according to 
standard protocols (Ausubel et al., 1995). 

Example 1: Functional genomics with phages: Overexpression of N1 fusion proteins, 
purification from inclusion bodies and biopanning of phage display libraries against the 
refolded fusion proteins 

Generation of expression vectors 

All vectors used are derivatives of expression vector pTFT74 (Freund et al., 1993). Into this 
vector, the DNA sequence coding for amino acids 1-82 of mature g3p of phage fd containing 
an additional methionine residue at the N-terminus, a multiple cloning site and a DNA 
sequence coding for a 6xHis purification tag has been inserted between the unique NcoI and 
HindIII sites, generating vector pTFT74-N1-MCS-H (Figure 1, complete vector sequence 
given in appendix). The first 82 amino acids of the mature g3p contain domain N1 (amino 
acids 1 - 67) and the first 15 amino acids of the linker between N1 and N2 (Lubkowski et al., 
1998). A second vector, pTFT74-H-N1-MCS, was generated which contains between the 
unique NcoI and HindIII sites a DNA sequence coding for Met-Ala, a 6xHis purification tag 
and amino acids 2-82 of g3p of phage fd fused to a multiple cloning site and three stop codons 
for all 3 reading frames (Figure 2, complete vector sequence given in appendix). 
Compared to the published sequence, a G to T nucleotide exchange at position 57 has been 
found in vector pTFT74. 

Into vector pTFT74-N1-MCS-H, DNA fragments generated by PCR or made as an 
oligonucleotide cassette coding for the amino acid sequences given below and in the legend to 
Figure 3 have been cloned either between the unique BsiWI and HindIII sites or between the 
unique XbaI and EcoRI sites. 

Vector pTFT74-H-N1-MCS will be used for high-throughput cloning of PCR-amplified ESTs 
similar to the procedure described by Hua et al. (1998), but introducing appropriate restriction 
sites at the 5' and 3' ends during PCR. This way, for oligo-dT-primed, directionally cloned 
cDNAs, only 4 primers are needed for the amplification of the insert of each cDNA cloning 
vector (3 forward primers for amplification of EST inserts in three open reading frames and 
one reverse primer corresponding to the downstream sequence of the cDNA cloning vector). 8 
primers are needed for each cDNA cloning vector for the generation of 6 PCR products 
covering all 6 possible reading frames of the insert. 
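
The reading-frame logic behind the three forward primers can be illustrated with a short sketch. It assumes the standard genetic code and uses a short illustrative insert; the helper names (`translate`, `three_frames`, `reverse_complement`) are hypothetical and not part of the cloning procedure itself. Shifting the start of translation by 0, 1 or 2 nucleotides gives the three forward frames; doing the same on the reverse complement gives the remaining three.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the first nucleotide onwards."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def three_frames(dna: str) -> list[str]:
    """Translate with the start shifted by 0, 1 and 2 nucleotides."""
    return [translate(dna[offset:]) for offset in range(3)]


insert = "ATGGCTGAAACTGTTGAA"  # illustrative sequence only
forward = three_frames(insert)
reverse = three_frames(reverse_complement(insert))
for frame, protein in enumerate(forward + reverse, start=1):
    print(frame, protein)
```

In the cloning scheme described above, the frame shift is introduced by the forward primer rather than by trimming the insert, but the effect on the encoded (poly)peptide is the same.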

Expression, purification and refolding of fusion proteins 

Expression, purification and refolding were performed as described (C. Krebber, 1996; Krebber 
et al., 1997). Briefly, BL21(DE3)pLysS cells (Studier et al., 1990) were transformed with the 
respective pTFT74 vector (see below) and grown to an OD550 of 0.9-1.2. N1 fusion protein 
expression was induced for 3 h with 1 mM IPTG at 37°C. N1 fusion proteins were 
isolated by Ni-NTA chromatography from solubilised inclusion bodies and refolded. Protein 
concentration during refolding was usually <1 mg/ml. 

The following constructs have been used: 

- N1-hag: N1 (amino acids 1-82 of mature g3p of phage fd containing an additional 
methionine residue at the N-terminus) fused to the amino acid sequence 
PYDVPDYASLRSHHHHHH, which includes the epitope DVPDYAS from hemagglutinin 
recognised by antibody 17/9 (Schulze-Gahmen et al., 1993; Krebber et al., 1995). Obtainable 
by cloning of an oligonucleotide cassette (made from the following 2 oligonucleotides: 
5'-GTACGACGTTCCAGACTACGCTTCCCTGCGTTCCCATCACCATCACCATCACTA-3' 
and 5'-AGCTTAGTGATGGTGATGGTGATGGGAACGCAGGGAAGCGTAGTCTGGAACGTC-3') 
between the BsiWI and HindIII sites of vector pTFT74-N1-MCS-H. 

- N1-MacI: N1 (amino acids 1-82 of mature g3p of phage fd containing an additional 
methionine residue at the N-terminus) fused to the amino acid sequence 

PYGGGSGGGSGSDIAFLIDGSGSIIPHDFRRMKEFVSTVMEQLKKSKTLFSLMQYSEEF 
RIHFTFKEFQNNPNPRSLVKPITQLLGRTHTATGIRKVVRELFNITNGARKNAFKILVVI 
TDGEKFGDPLGYEDVIPEADREGVIRYVIGVGDAFRSEKSRQELNTIASKPPRDHVFQ 
VNNFEALKTIQNQLREKIFAIEGTQTGSSSSFEHEMSQE (which contains amino acids 
149 - 353 of human CR-3 alpha chain (SWISS-PROT entry P11215)) and a C-terminal 
sequence containing a 6xHis tag. Obtainable by PCR using cDNA of HL-60 cells as a 
template and oligonucleotides CR-3for 
(5'-GTACGTACGGGGGCGGCTCTGGTGGTGGTTCTGGTAGTGACATTGCCTTCTTGATTGATGGC-3') 
and CR-3rev 
(5'-GTAAAGCTTAGTGATGGTGATGGTGATGTCTACCTTCGATTTCCTGAGACATCTCATGCTCAAAGGAGC-3'), 
digest of the PCR product with restriction enzymes BsiWI and HindIII, 
and cloning of the fragment between the BsiWI and HindIII sites of vector pTFT74-N1-MCS-H, 
generating vector pTFT74-N1-MacI-H. 

- N1 (Krebber et al., 1997) 
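
As a consistency check on the N1-hag cassette listed above, the sketch below tests that the two oligonucleotides anneal to a duplex with a 5'-GTAC overhang (compatible with a BsiWI cut) and a 5'-AGCT overhang (compatible with a HindIII cut), and that the ligated top strand encodes PYDVPDYASLRSHHHHHH followed by a stop codon. It assumes the standard genetic code and that the two nucleotides `CC` immediately upstream of the BsiWI overhang complete the Pro codon, as read from the vector sequence in SEQ ID NO: 8. The function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

TOP = "GTACGACGTTCCAGACTACGCTTCCCTGCGTTCCCATCACCATCACCATCACTA"      # SEQ ID NO: 3
BOTTOM = "AGCTTAGTGATGGTGATGGTGATGGGAACGCAGGGAAGCGTAGTCTGGAACGTC"   # SEQ ID NO: 4


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_to_stop(dna: str) -> str:
    """Translate from the first nucleotide until the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# The strands pair everywhere except the two 4-nt 5' overhangs.
duplex_ok = TOP[4:] == reverse_complement(BOTTOM)[:-4]
top_overhang, bottom_overhang = TOP[:4], BOTTOM[:4]

# Assumption: "CC" precedes the BsiWI overhang in the vector; "AGCTT" is what
# follows the insert once the HindIII site has been re-formed.
ligated = "CC" + TOP + "AGCTT"
encoded = translate_to_stop(ligated)
print(duplex_ok, top_overhang, bottom_overhang, encoded)
```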

For the N1 fusions shown in Figure 3, DNA fragments have been amplified by PCR from 
cDNA clones or from genomic DNA and cloned between the XbaI and EcoRI sites of vector 
pTFT74-N1-MCS-H. 

For screening of N1-MacI binders, a purified fragment (MacI) of human CR-3 alpha chain 
(SWISS-PROT entry P11215) was used which contains amino acids 149 - 353 of human 
CR-3 alpha fused to a C-terminal sequence containing a 6xHis tag. Obtainable by PCR from 
clone pTFT74-N1-MacI-H. An ATG codon was added to the 5' end of the gene during cloning. 
Expression and purification were performed using standard methods (The QIAexpressionist™ 
3rd edition: A handbook for high-level expression and purification of 6xHis-tagged proteins 
(July 1998). QIAGEN GmbH, Hilden, Germany). 

Panning of the HuCAL scFv phage library against N1-MacI and N1 

Panning against N1-MacI and N1 and characterisation of selected scFvs were performed using 
standard procedures (Kay et al., 1996) and the HuCAL scFv library (WO 97/08320). N1-MacI 
and N1 were coated for 12 h at 4°C at a concentration of 10 µg/ml in PBS onto Nunc Maxisorb 
microtiter plates (# 442404). In the case of N1-MacI, phages were mixed 1:1 before panning 
with either PBS containing 5% skim milk powder and 0.1% Tween 20 (panning NMa) or PBS 
containing 5% skim milk powder, 0.1% Tween 20 and 0.5 mg/ml N1-hag (panning NMb). In 
the case of N1, phages were mixed 1:1 before panning with either PBS containing 5% skim 
milk powder and 0.1% Tween 20 (panning Na) or PBS containing 5% skim milk powder, 0.1% 
Tween 20 and 0.5 mg/ml N1 (panning Nb). Phages were incubated in these buffers for 2 h at 
room temperature before they were applied to the ELISA well coated with antigen. 

After 3 rounds of panning, 92 clones from each panning were analysed in ELISA. In pannings 
Na and Nb, no binders against N1 were obtained, while in pannings NMa and NMb several 
binders against N1-MacI were selected. These binders were also tested for binding to MacI. 
Clones which showed a signal of at least 3x above background in ELISA were considered 
positive. 

1. NMa 

Positives against N1-MacI: 77 
Positives against MacI: 37 

2. NMb 

Positives against N1-MacI: 85 
Positives against MacI: 80 

All MacI binders also recognise N1-MacI. The relatively small amounts of N1-hag used for 
blocking roughly doubled the number of MacI binders (from 37 to 80). There are, however, 
additional N-terminal linker residues in N1-MacI, so complete blocking of non-MacI binders 
using N1-hag is not possible. 
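
The ELISA counts above can be put on a common scale with a few lines of arithmetic. This sketch only re-expresses the numbers reported for pannings NMa and NMb (92 clones analysed per panning); the variable names are illustrative.

```python
CLONES_PER_PANNING = 92

# Positives reported above: (against N1-MacI, against MacI)
positives = {"NMa": (77, 37), "NMb": (85, 80)}

for panning, (n1_maci, maci) in positives.items():
    print(
        f"{panning}: {n1_maci / CLONES_PER_PANNING:.0%} bind N1-MacI, "
        f"{maci / CLONES_PER_PANNING:.0%} bind MacI, "
        f"{maci / n1_maci:.0%} of N1-MacI binders also bind MacI"
    )

# Relative change in MacI binders when N1-hag is added during panning.
increase = (positives["NMb"][1] - positives["NMa"][1]) / positives["NMa"][1]
print(f"MacI binders: {increase:+.0%}")
```

The last line quantifies the effect of blocking with N1-hag: the number of MacI binders slightly more than doubles between pannings NMa and NMb.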

For some binders a specificity ELISA was performed showing that the selected scFvs bind 
strongly and specifically to MacI (Figure 4). 



Example 2: Construction and properties of expression vector pBAD-N1-MCS-H 

The vector pBAD-N1-MCS-H is based on the expression vector pBAD/Myc-His A 
(Invitrogen Corporation, Carlsbad, CA, USA), and allows the expression of proteins under the 
control of the tightly regulated araBAD promoter. 

The vector pBAD-N1-MCS-H was constructed by insertion of an expression cassette (311 
bp, Nco I / Hind III fragment) comprising a coding region encoding the N1 domain followed 
by a multiple cloning site (MCS) and a coding region encoding a 6xHis tag into 
pBAD/Myc-His A digested with Nco I / Hind III (4046 bp). The vector map and sequence of 
pBAD-N1-MCS-H are shown in Figure 5. 

The advantage of this vector compared to the pTFT vectors (see Examples 1 and 3) is a 
tighter control of fusion protein expression, which allows the cloning of potentially toxic 
constructs. Furthermore, no additional cloning step for the transfer from a cloning strain into 
an expression strain is necessary. A disadvantage is that expression yields are sometimes 
lower than with the pTFT vectors. 

Example 3: Expression of fusion proteins comprising the N1 domain of the geneIII 
protein 

Cloning of expression vectors. 

The vector used for expression of N1 fusion proteins is the vector pTFT74-N1-MCS-H 
(Figure 1, complete vector sequence given in appendix) as described in Example 1. Into 
vector pTFT74-N1-MCS-H, DNA fragments generated by PCR or made as an oligonucleotide 
cassette coding for (poly)peptides and proteins given in brackets below have been cloned 
either between the unique BsiWI and HindIII sites or between the unique XbaI and EcoRI 
sites, generating vectors pTFT74-N1-hag (see Example 1), pTFT74-N1-MacI (see Example 1), 
pTFT74-N1-U1fl (N1 fused to full-length UL84 of hCMV), pTFT74-N1-U2 (N1 fused to a 
polypeptide containing amino acids 68 - 586 of UL84 of hCMV), pTFT74-N1-U4 (N1 fused 
to a polypeptide containing amino acids 300 - 586 of UL84 of hCMV), pTFT74-N1-I1fl (N1 
fused to mature full-length human ICAM-1), pTFT74-N1-I3 (N1 fused to a polypeptide 
containing amino acids 401 - 480 of human ICAM-1), pTFT74-N1-I4 (N1 fused to a 
polypeptide containing amino acids 151 - 532 of human ICAM-1), pTFT74-N1-B1 (N1 fused 
to a polypeptide containing amino acids 1 - 198 of a mature human MHC class II beta chain), 
pTFT74-N1-A14 (N1 fused to a polypeptide containing amino acids 1 - 192 of a mature 
human MHC class II alpha chain) and pTFT74-N1-Np50 (N1 fused to a polypeptide 
containing amino acids 2 - 366 of human NF-κB p50). All constructs contain a C-terminal 
hexa-histidine tag for affinity purification. 

High-level expression of N1 fusion proteins. 

Domain N1 of g3p of filamentous bacteriophage M13 can be over-expressed in E. coli, 
purified from inclusion bodies and refolded into active protein (Krebber et al., 1997). 
Different polypeptides were fused C-terminally to N1 and expressed in E. coli, leading to 
high-level production and inclusion body formation (Figure 6). In the case of N1-MacI, no 
further purification could be achieved by Ni-NTA chromatography, as the inclusion bodies 
already contained almost exclusively N1-MacI (Figure 6). Surprisingly, all N1 fusion 
proteins (10/10) were soluble after refolding at concentrations of ~0.3 - 1.0 mg/ml using the 
same refolding conditions, and the purity was at least 90% (Figure 7). Protein yields were as 
high as 100 mg/l of expression culture in the case of N1-MacI and were usually in the range 
between 1 mg and 10 mg/l of expression culture. 

SEQUENCE LISTING 



<110> Frisch, Christian 
Kretzschmar, Titus 
Hoss, Adolf 
Von Ruden, Thomas 

<120> Generation of specific binding partners binding to (poly)peptides 
encoded by genomic DNA fragments or ESTs 

<130> Morpho/10 

<140> 

<141> 2001-02-28 

<150> PCT/EP00/06137 
<151> 2000-06-3 

<150> EP99 11 2815.8 
<151> 1999-07-02 

<160> 10 

<170> PatentIn version 3.0 

<210> 1 
<211> 18 
<212> PRT 

<213> artificial sequence 
<220> 

<221> PEPTIDE 
<222> (1)..(18) 

<223> synthetic expression construct 
<400> 1 

Pro Tyr Asp Val Pro Asp Tyr Ala Ser Leu Arg Ser His His His His 
1 5 10 15 

His His 

<210> 2 
<211> 7 
<212> PRT 

<213> artificial sequence 
<220> 

<221> PEPTIDE 
<222> (1)..(7) 
<223> synthetic construct 
hemagglutinin epitope 
<400> 2 

Asp Val Pro Asp Tyr Ala Ser 
1 5 

<210> 3 
<211> 54 
<212> DNA 

<213> artificial sequence 
<220> 

<221> misc_feature 
<222> (1)..(54) 

<223> DNA primer for cloning of an oligonucleotide cassette 
<400> 3 

gtacgacgtt ccagactacg cttccctgcg ttcccatcac catcaccatc acta 54 

<210> 4 
<211> 54 
<212> DNA 

<213> artificial sequence 
<220> 

<221> misc_feature 
<222> (1)..(54) 

<223> DNA primer for cloning of an oligonucleotide cassette 
<400> 4 

agcttagtga tggtgatggt gatgggaacg cagggaagcg tagtctggaa cgtc 54 

<210> 5 
<211> 216 
<212> PRT 

<213> artificial sequence 
<220> 

<221> PEPTIDE 
<222> (1)..(216) 
<223> synthetic construct 

contains amino acids 149-353 of human CR-3 alpha chain 

<400> 5 

Pro Tyr Gly Gly Gly Ser Gly Gly Gly Ser Gly Ser Asp Ile Ala Phe 
1 5 10 15 

Leu Ile Asp Gly Ser Gly Ser Ile Ile Pro His Asp Phe Arg Arg Met 
20 25 30 

Lys Glu Phe Val Ser Thr Val Met Glu Gln Leu Lys Lys Ser Lys Thr 
35 40 45 

Leu Phe Ser Leu Met Gln Tyr Ser Glu Glu Phe Arg Ile His Phe Thr 
50 55 60 

Phe Lys Glu Phe Gln Asn Asn Pro Asn Pro Arg Ser Leu Val Lys Pro 
65 70 75 80 

Ile Thr Gln Leu Leu Gly Arg Thr His Thr Ala Thr Gly Ile Arg Lys 
85 90 95 

Val Val Arg Glu Leu Phe Asn Ile Thr Asn Gly Ala Arg Lys Asn Ala 
100 105 110 

Phe Lys Ile Leu Val Val Ile Thr Asp Gly Glu Lys Phe Gly Asp Pro 
115 120 125 

Leu Gly Tyr Glu Asp Val Ile Pro Glu Ala Asp Arg Glu Gly Val Ile 
130 135 140 

Arg Tyr Val Ile Gly Val Gly Asp Ala Phe Arg Ser Glu Lys Ser Arg 
145 150 155 160 

Gln Glu Leu Asn Thr Ile Ala Ser Lys Pro Pro Arg Asp His Val Phe 
165 170 175 

Gln Val Asn Asn Phe Glu Ala Leu Lys Thr Ile Gln Asn Gln Leu Arg 
180 185 190 

Glu Lys Ile Phe Ala Ile Glu Gly Thr Gln Thr Gly Ser Ser Ser Ser 
195 200 205 

Phe Glu His Glu Met Ser Gln Glu 
210 215 

<210> 6 

<211> 62 

<212> DNA 

<213> artificial sequence 
<220> 

<221> misc_feature 
<222> (1)..(62) 
<223> synthetic construct 
DNA forward primer 

<400> 6 

gtacgtacgg gggcggctct ggtggtggtt ctggtagtga cattgccttc ttgattgatg 60 
gc 62 

<210> 7 
<211> 69 
<212> DNA 

<213> artificial sequence 
<220> 

<221> misc_feature 
<222> (1)..(69) 
<223> synthetic construct 
DNA reverse primer 

<400> 7 

gtaaagctta gtgatggtga tggtgatgtc taccttcgat ttcctgagac atctcatgct 60 
caaaggagc 69 

<210> 8 

<211> 2869 

<212> DNA 

<213> artificial sequence 
<220> 

<221> misc_feature 
<222> (1)..(2869) 
<223> synthetic construct 
expression vector 

<400> 8 

acccgacacc atcgaaatta atacgactca ctatagggag accacaacgg tttccctaat 60 
tgtgagcgga taacaataga aataattttg tttaacttta agaaggagat atatccatgg 120 
ctgaaactgt tgaaagttgt ttagcaaaat cccatacaga aaattcattt actaacgtct 180 
ggaaagacga caaaacttta gatcgttacg ctaactatga gggctgtctg tggaatgcta 240 
caggcgttgt agtttgtact ggtgacgaaa ctcagtgtta cggtacatgg gttcctattg 300 
ggcttgctat ccctgaaaat gagggtggtg gctctgaggg tggcggttct gagggtggcg 360 
gttctccgta cggctctaga gtcgacgagc tcgatatcgg cggccgcgaa ttctctcatc 420 
accatcacca tcactaagct tcagtcccgg gcagtggatc cggctgctaa caaagcccga 480 
aaggaagctg agttggctgc tgccaccgct gagcaataac tagcataacc ccttggggcc 540 
tctaaacggg tcttgagggg ttttttgctg aaaggaggaa ctatatccgg atcgagatcc 600 
ccacgcgccc tgtagcggcg cattaagcgc ggcgggtgtg gtggttacgc gcagcgtgac 660 
cgctacactt gccagcgccc tagcgcccgc tcctttcgct ttcttccctt cctttctcgc 720 
cacgttcgcc ggctttcccc gtcaagctct aaatcggggc atccctttag ggttccgatt 780 
tagtgcttta cggcacctcg accccaaaaa acttgattag ggtgatggtt cacgtagtgg 840 
gccatcgccc tgatagacgg tttttcgccc tttgacgttg gagtccacgt tctttaatag 900 
tggactcttg ttccaaactg gaacaacact caaccctatc tcggtctatt cttttgattt 960 
ataagggatt ttgccgattt cggcctattg gttaaaaaat gagctgattt aacaaaaatt 1020 
taacgcgaat tttaacaaaa tattaacgtt tacaatttca ggtggcactt ttcggggaaa 1080 
tgtgcgcgga acccctattt gtttattttt ctaaatacat tcaaatatgt atccgctcat 1140 
gagacaataa ccctgataaa tgcttcaata atattgaaaa aggaagagta tgagtattca 1200 
acatttccgt gtcgccctta ttcccttttt tgcggcattt tgccttcctg tttttgctca 1260 
cccagaaacg ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac gagtgggtta 1320 
catcgaactg gatctcaaca gcggtaagat ccttgagagt tttcgccccg aagaacgttt 1380 
tccaatgatg agcactttta aagttctgct atgtggcgcg gtattatccc gtattgacgc 1440 
cgggcaagag caactcggtc gccgcataca ctattctcag aatgacttgg ttgagtactc 1500 
accagtcaca gaaaagcatc ttacggatgg catgacagta agagaattat gcagtgctgc 1560 
cataaccatg agtgataaca ctgcggccaa cttacttctg acaacgatcg gaggaccgaa 1620 
ggagctaacc gcttttttgc acaacatggg ggatcatgta actcgccttg atcgttggga 1680 
accggagctg aatgaagcca taccaaacga cgagcgtgac accacgatgc ctgtagcaat 1740 
ggcaacaacg ttgcgcaaac tattaactgg cgaactactt actctagctt cccggcaaca 1800 
attaatagac tggatggagg cggataaagt tgcaggacca cttctgcgct cggcccttcc 1860 
ggctggctgg tttattgctg ataaatctgg agccggtgag cgtgggtctc gcggtatcat 1920 
tgcagcactg gggccagatg gtaagccctc ccgtatcgta gttatctaca cgacggggag 1980 
tcaggcaact atggatgaac gaaatagaca gatcgctgag ataggtgcct cactgattaa 2040 
gcattggtaa ctgtcagacc aagtttactc atatatactt tagattgatt taaaacttca 2100 
tttttaattt aaaaggatct aggtgaagat cctttttgat aatctcatga ccaaaatccc 2160 
ttaacgtgag ttttcgttcc actgagcgtc agaccccgta gaaaagatca aaggatcttc 2220 
ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc 2280 
agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt 2340 
cagcagagcg cagataccaa atactgtcct tctagtgtag ccgtagttag gccaccactt 2400 
caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc 2460 
tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt taccggataa 2520 
ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac 2580 
ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc ttcccgaagg 2640 
gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga 2700 
gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact 2760 
tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa 2820 
cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatg 2869 

<210> 9 
<211> 2865 
<212> DNA 

<213> artificial sequence 
<220> 

<221> misc_feature 
<222> (1)..(2865) 
<223> synthetic construct 
expression vector 

<400> 9 

acccgacacc atcgaaatta atacgactca ctatagggag accacaacgg tttccctaat 60 
tgtgagcgga taacaataga aataattttg tttaacttta agaaggagat atatccatgg 120 
ctcatcacca tcaccatcac gaaactgttg aaagttgttt agcaaaatcc catacagaaa 180 
attcatttac taacgtctgg aaagacgaca aaactttaga tcgttacgct aactatgagg 240 
gctgtctgtg gaatgctaca ggcgttgtag tttgtactgg tgacgaaact cagtgttacg 300 
gtacatgggt tcctattggg cttgctatcc ctgaaaatga gggtggtggc tctgagggtg 360 
gcggttctga gggtggcggt tcttctagag tcgacgagct cgatatcgaa ttcggcggcc 420 
gctaactgac taagcttcag tcccgggcag tggatccggc tgctaacaaa gcccgaaagg 480 
aagctgagtt ggctgctgcc accgctgagc aataactagc ataacccctt ggggcctcta 540 
aacgggtctt gaggggtttt ttgctgaaag gaggaactat atccggatcg agatccccac 600 
gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct 660 
acacttgcca gcgccctagc gcccgctcct ttcgctttct tcccttcctt tctcgccacg 720 
ttcgccggct ttccccgtca agctctaaat cggggcatcc ctttagggtt ccgatttagt 780 
gctttacggc acctcgaccc caaaaaactt gattagggtg atggttcacg tagtgggcca 840 
tcgccctgat agacggtttt tcgccctttg acgttggagt ccacgttctt taatagtgga 900 
ctcttgttcc aaactggaac aacactcaac cctatctcgg tctattcttt tgatttataa 960 
gggattttgc cgatttcggc ctattggtta aaaaatgagc tgatttaaca aaaatttaac 1020 
gcgaatttta acaaaatatt aacgtttaca atttcaggtg gcacttttcg gggaaatgtg 1080 
cgcggaaccc ctatttgttt atttttctaa atacattcaa atatgtatcc gctcatgaga 1140 
caataaccct gataaatgct tcaataatat tgaaaaagga agagtatgag tattcaacat 1200 
ttccgtgtcg cccttattcc cttttttgcg gcattttgcc ttcctgtttt tgctcaccca 1260 
gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg gtgcacgagt gggttacatc 1320 
gaactggatc tcaacagcgg taagatcctt gagagttttc gccccgaaga acgttttcca 1380 
atgatgagca cttttaaagt tctgctatgt ggcgcggtat tatcccgtat tgacgccggg 1440 
caagagcaac tcggtcgccg catacactat tctcagaatg acttggttga gtactcacca 1500 
gtcacagaaa agcatcttac ggatggcatg acagtaagag aattatgcag tgctgccata 1560 
accatgagtg ataacactgc ggccaactta cttctgacaa cgatcggagg accgaaggag 1620 
ctaaccgctt ttttgcacaa catgggggat catgtaactc gccttgatcg ttgggaaccg 1680 
gagctgaatg aagccatacc aaacgacgag cgtgacacca cgatgcctgt agcaatggca 1740 
acaacgttgc gcaaactatt aactggcgaa ctacttactc tagcttcccg gcaacaatta 1800 
atagactgga tggaggcgga taaagttgca ggaccacttc tgcgctcggc ccttccggct 1860 
ggctggttta ttgctgataa atctggagcc ggtgagcgtg ggtctcgcgg tatcattgca 1920 
gcactggggc cagatggtaa gccctcccgt atcgtagtta tctacacgac ggggagtcag 1980 
gcaactatgg atgaacgaaa tagacagatc gctgagatag gtgcctcact gattaagcat 2040 
tggtaactgt cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt 2100 
taatttaaaa ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa 2160 
cgtgagtttt cgttccactg agcgtcagac cccgtagaaa agatcaaagg atcttcttga 2220 
gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa aaaaaccacc gctaccagcg 2280 
gtggtttgtt tgccggatca agagctacca actctttttc cgaaggtaac tggcttcagc 2340 
agagcgcaga taccaaatac tgtccttcta gtgtagccgt agttaggcca ccacttcaag 2400 
aactctgtag caccgcctac atacctcgct ctgctaatcc tgttaccagt ggctgctgcc 2460 
agtggcgata agtcgtgtct taccgggttg gactcaagac gatagttacc ggataaggcg 2520 
cagcggtcgg gctgaacggg gggttcgtgc acacagccca gcttggagcg aacgacctac 2580 
accgaactga gatacctaca gcgtgagcta tgagaaagcg ccacgcttcc cgaagggaga 2640 
aaggcggaca ggtatccggt aagcggcagg gtcggaacag gagagcgcac gagggagctt 2700 
ccagggggaa acgcctggta tctttatagt cctgtcgggt ttcgccacct ctgacttgag 2760 
cgtcgatttt tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc cagcaacgcg 2820 
gcctttttac ggttcctggc cttttgctgg ccttttgctc acatg 2865 

<210> 10 
<211> 4357 
<212> DNA 

<213> artificial sequence 
<220> 

<221> misc_feature 
<222> (1)..(4357) 
<223> synthetic construct 
expression vector 

<400> 10 

aagaaaccaa ttgtccatat tgcatcagac attgccgtca ctgcgtcttt tactggctct 60 
tctcgctaac caaaccggta accccgctta ttaaaagcat tctgtaacaa agcgggacca 120 
aagccatgac aaaaacgcgt aacaaaagtg tctataatca cggcagaaaa gtccacattg 180 
attatttgca cggcgtcaca ctttgctatg ccatagcatt tttatccata agattagcgg 240 
atcctacctg acgcttttta tcgcaactct ctactgtttc tccatacccg tttttttggg 300 
ctaacaggag gaattaacca tggctgaaac tgttgaaagt tgtttagcaa aatcccatac 360 
agaaaattca tttactaacg tctggaaaga cgacaaaact ttagatcgtt acgctaacta 420 
tgagggctgt ctgtggaatg ctacaggcgt tgtagtttgt actggtgacg aaactcagtg 480 
ttacggtaca tgggttccta ttgggcttgc tatccctgaa aatgagggtg gtggctctga 540 
gggtggcggt tctgagggtg gcggttctag agtcgacgag ctcgatatcg gcggccgcga 600 
attctctcat caccatcacc atcactaagc ttgggcccga acaaaaactc atctcagaag 660 
aggatctgaa tagcgccgtc gaccatcatc atcatcatca ttgagtttaa acggtctcca 720 
gcttggctgt tttggcggat gagagaagat tttcagcctg atacagatta aatcagaacg 780 
cagaagcggt ctgataaaac agaatttgcc tggcggcagt agcgcggtgg tcccacctga 840 
ccccatgccg aactcagaag tgaaacgccg tagcgccgat ggtagtgtgg ggtctcccca 900 
tgcgagagta gggaactgcc aggcatcaaa taaaacgaaa ggctcagtcg aaagactggg 960 
cctttcgttt tatctgttgt ttgtcggtga acgctctcct gagtaggaca aatccgccgg 1020 
gagcggattt gaacgttgcg aagcaacggc ccggagggtg gcgggcagga cgcccgccat 1080 
aaactgccag gcatcaaatt aagcagaagg ccatcctgac ggatggcctt tttgcgtttc 1140 
tacaaactct ttttgtttat ttttctaaat acattcaaat atgtatccgc tcatgagaca 1200 
ataaccctga taaatgcttc aataatattg aaaaaggaag agtatgagta ttcaacattt 1260 
ccgtgtcgcc cttattccct tttttgcggc attttgcctt cctgtttttg ctcacccaga 1320 
aacgctggtg aaagtaaaag atgctgaaga tcagttgggt gcacgagtgg gttacatcga 1380 
actggatctc aacagcggta agatccttga gagttttcgc cccgaagaac gttttccaat 1440 
gatgagcact tttaaagttc tgctatgtgg cgcggtatta tcccgtgttg acgccgggca 1500 
agagcaactc ggtcgccgca tacactattc tcagaatgac ttggttgagt actcaccagt 1560 
cacagaaaag catcttacgg atggcatgac agtaagagaa ttatgcagtg ctgccataac 1620 
catgagtgat aacactgcgg ccaacttact tctgacaacg atcggaggac cgaaggagct 1680 
aaccgctttt ttgcacaaca tgggggatca tgtaactcgc cttgatcgtt gggaaccgga 1740 
gctgaatgaa gccataccaa acgacgagcg tgacaccacg atgcctgtag caatggcaac 1800 
aacgttgcgc aaactattaa ctggcgaact acttactcta gcttcccggc aacaattaat 1860 
agactggatg gaggcggata aagttgcagg accacttctg cgctcggccc ttccggctgg 1920 
ctggtttatt gctgataaat ctggagccgg tgagcgtggg tctcgcggta tcattgcagc 1980 
actggggcca gatggtaagc cctcccgtat cgtagttatc tacacgacgg ggagtcaggc 2040 
aactatggat gaacgaaata gacagatcgc tgagataggt gcctcactga ttaagcattg 2100 
gtaactgtca gaccaagttt actcatatat actttagatt gatttaaaac ttcattttta 2160 
atttaaaagg atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg 2220 
tgagttttcg ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga 2280 
tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt 2340 
ggtttgtttg ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag 2400 
agcgcagata ccaaatactg tccttctagt gtagccgtag ttaggccacc acttcaagaa 2460 
ctctgtagca ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag 2520 
tggcgataag tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca 2580 
gcggtcgggc tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac 2640 
cgaactgaga tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa 2700 
ggcggacagg tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc 2760 
agggggaaac gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg 2820 
tcgatttttg tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc 2880 
ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc ctgcgttatc 2940 
ccctgattct gtggataacc gtattaccgc ctttgagtga gctgataccg ctcgccgcag 3000 
ccgaacgacc gagcgcagcg agtcagtgag cgaggaagcg gaagagcgcc tgatgcggta 3060 
ttttctcctt acgcatctgt gcggtatttc acaccgcata tggtgcactc tcagtacaat 3120
ctgctctgat gccgcatagt taagccagta tacactccgc tatcgctacg tgactgggtc 3180
atggctgcgc cccgacaccc gccaacaccc gctgacgcgc cctgacgggc ttgtctgctc 3240 
ccggcatccg cttacagaca agctgtgacc gtctccggga gctgcatgtg tcagaggttt 3300 
tcaccgtcat caccgaaacg cgcgaggcag cagatcaatt cgcgcgcgaa ggcgaagcgg 3360 
catgcataat gtgcctgtca aatggacgaa gcagggattc tgcaaaccct atgctactcc 3420 
gtcaagccgt caattgtctg attcgttacc aattatgaca acttgacggc tacatcattc 3480 
actttttctt cacaaccggc acggaactcg ctcgggctgg ccccggtgca ttttttaaat 3540 
acccgcgaga aatagagttg atcgtcaaaa ccaacattgc gaccgacggt ggcgataggc 3600
atccgggtgg tgctcaaaag cagcttcgcc tggctgatac gttggtcctc gcgccagctt 3660 
aagacgctaa tccctaactg ctggcggaaa agatgtgaca gacgcgacgg cgacaagcaa 3720 
acatgctgtg cgacgctggc gatatcaaaa ttgctgtctg ccaggtgatc gctgatgtac 3780 
tgacaagcct cgcgtacccg attatccatc ggtggatgga gcgactcgtt aatcgcttcc 3840 
atgcgccgca gtaacaattg ctcaagcaga tttatcgcca gcagctccga atagcgccct 3900 
tccccttgcc cggcgttaat gatttgccca aacaggtcgc tgaaatgcgg ctggtgcgct 3960 
tcatccgggc gaaagaaccc cgtattggca aatattgacg gccagttaag ccattcatgc 4020 
cagtaggcgc gcggacgaaa gtaaacccac tggtgatacc attcgcgagc ctccggatga 4080 
cgaccgtagt gatgaatctc tcctggcggg aacagcaaaa tatcacccgg tcggcaaaca 4140 
aattctcgtc cctgattttt caccaccccc tgaccgcgaa tggtgagatt gagaatataa 4200 
cctttcattc ccagcggtcg gtcgataaaa aaatcgagat aaccgttggc ctcaatcggc 4260 
gttaaacccg ccaccagatg ggcattaaac gagtatcccg gcagcagggg atcattttgc 4320 
gcttcagcca tacttttcat actcccgcca ttcagag 4357 